Definition (Shattering)  A hypothesis class H H H C ⊂ X C \subset X C ⊂ X H H H C C C C C C { 0 , 1 } \{0, 1\} { 0 , 1 } ∣ H C ∣ = 2 ∣ C ∣ |H_C| = 2^{|C|} ∣ H C  ∣ = 2 ∣ C ∣ 
Corollary  Let H H H X → { 0 , 1 } X \to \{0, 1\} X → { 0 , 1 } C ⊂ X C \subset X C ⊂ X 2 m 2m 2 m H H H A A A D D D X × { 0 , 1 } X \times \{0, 1\} X × { 0 , 1 } h ∈ H h \in H h ∈ H L D ( h ) = 0 L_D(h) = 0 L D  ( h ) = 0 1 / 7 1/7 1/7 S ∼ D m S \sim D^m S ∼ D m L D ( A ( S ) ) ≥ 1 / 8 L_D(A(S)) \ge 1/8 L D  ( A ( S )) ≥ 1/8  Philosophically, if someone can explain every phenomoenon, his explanations are worthless 
Definition (VC Dimension)  The VC-dimension of a hypothesis class H H H V C d i m ( H ) VCdim(H) V C d im ( H ) C ⊂ X C \subset X C ⊂ X H H H H H H H H H 
Theorem  Let H H H  
Proof  Since H H H 
How to show V C d i m ( H ) = d VCdim(H) = d V C d im ( H ) = d   There exists a set C C C d d d H H H  Every set C C C d + 1 d+1 d + 1 H H H  Example for finite classes  For any finite hypothesis class H H H V C d i m ( H ) ≤ log  ∣ H ∣ VCdim(H) \le \log{|H|} V C d im ( H ) ≤ log  ∣ H ∣ C , ∣ H C ∣ ≤ ∣ H ∣ C, |H_C| \le |H| C , ∣ H C  ∣ ≤ ∣ H ∣ 2 ∣ C ∣ > ∣ H ∣ 2|C| > |H| 2∣ C ∣ > ∣ H ∣ C C C 
Will consider adding some more examples.
Definition (Growth Function)  Let H H H H H H τ H : N → N \tau_H : \mathbb{N} \to \mathbb{N} τ H  : N → N 
τ H ( m ) = max  C ⊂ X : ∣ C ∣ = m ∣ H C ∣ . \begin{gather*} \tau_H(m)=\max_{C \subset X:|C|=m} |H_C|. \end{gather*} τ H  ( m ) = C ⊂ X : ∣ C ∣ = m max  ∣ H C  ∣.  Sauer's Lemma  Let H H H V C d i m ( H ) ≤ d < ∞ VCdim(H) \le d < \infty V C d im ( H ) ≤ d < ∞ m , τ H ( m ) ≤ ∑ i = 0 d ( m i ) m, \tau_H(m) \le \sum_{i=0}^d {m \choose i} m , τ H  ( m ) ≤ ∑ i = 0 d  ( i m  ) m > d + 1 m > d + 1 m > d + 1 τ H ( m ) ≤ ( e m / d ) d \tau_H(m) \le (em/d)^d τ H  ( m ) ≤ ( e m / d ) d  
Proof  ( k j ) = k ! j ! ( k − 1 ) ! = k ( k − 1 ) ⋯ ( k − j + 1 ) j ! = k d k − 1 d ⋯ k − j + 1 d d j j ! < ( k d ) j d j j ! ≤ ( k d ) d d j j ! \begin{align*} {k \choose j} & = {k! \over j!(k-1)!} \\ & = {k(k-1)\cdots (k-j+1) \over j!} \\ & = {k \over d}{k-1 \over d} \cdots {k-j+1 \over d}{d^j \over j!} \\ & < ({k \over d})^j {d^j \over j!} \\ & \le ({k \over d})^d {d^j \over j!} \end{align*} ( j k  )  = j ! ( k − 1 )! k !  = j ! k ( k − 1 ) ⋯ ( k − j + 1 )  = d k  d k − 1  ⋯ d k − j + 1  j ! d j  < ( d k  ) j j ! d j  ≤ ( d k  ) d j ! d j   Hence,
∑ j = 0 d ( k j ) ≤ ∑ j = 0 d ( k d ) d d j j ! ≤ ( k d ) d ∑ j = 0 ∞ d j j ! = ( e k d ) d \begin{align*} \sum_{j=0}^d {k \choose j} & \le \sum_{j=0}^d ({k \over d})^d {d^j \over j!} \\ & \le ({k \over d})^d \sum_{j=0}^\infty {d^j \over j!} \\ & = ({ek \over d})^d \end{align*} j = 0 ∑ d  ( j k  )  ≤ j = 0 ∑ d  ( d k  ) d j ! d j  ≤ ( d k  ) d j = 0 ∑ ∞  j ! d j  = ( d e k  ) d  Theorem  Let H H H τ H \tau_H τ H  D D D δ ∈ ( 0 , 1 ) \delta \in (0, 1) δ ∈ ( 0 , 1 ) 1 − δ 1-\delta 1 − δ S ∼ D m S \sim D^m S ∼ D m  
∣ L D ( h ) − L S ( h ) ∣ ≤ 4 + log  ( τ H ( 2 m ) ) δ 2 m \begin{gather*} |L_D(h) - L_S(h)| \le {4+\sqrt{\log{(\tau_H(2m))}} \over \delta\sqrt{2m}} \end{gather*} ∣ L D  ( h ) − L S  ( h ) ∣ ≤ δ 2 m  4 + log  ( τ H  ( 2 m ))    For simplicity assume that d log  ( 2 e m / d ) ≥ 4 \sqrt{d\log{(2em/d)}} \ge 4 d log  ( 2 e m / d )  ≥ 4 
∣ L D ( h ) − L S ( h ) ∣ ≤ 1 δ 2 d log  ( 2 e m / d ) m \begin{gather*} |L_D(h) - L_S(h)| \le {1 \over \delta} \sqrt{{2d\log{(2em/d)} \over m}} \end{gather*} ∣ L D  ( h ) − L S  ( h ) ∣ ≤ δ 1  m 2 d log  ( 2 e m / d )    Proof  We will start by showing that
E S ∼ D m [ sup  h ∈ H ∣ L D ( h ) − L S ( h ) ∣ ] ≤ 4 + log  ( τ H ( 2 m ) ) 2 m . \underset{S \sim \mathcal{D}^m}{\mathbb{E}}\left[\sup _{h \in \mathcal{H}}\left|L_{\mathcal{D}}(h)-L_S(h)\right|\right] \leq \frac{4+\sqrt{\log \left(\tau_{\mathcal{H}}(2 m)\right)}}{\sqrt{2 m}} . S ∼ D m E  [ h ∈ H sup  ∣ L D  ( h ) − L S  ( h ) ∣ ] ≤ 2 m  4 + log  ( τ H  ( 2 m ) )   . Since the random variable sup  h ∈ H ∣ L D ( h ) − L S ( h ) ∣ \sup _{h \in \mathcal{H}}\left|L_{\mathcal{D}}(h)-L_S(h)\right| sup h ∈ H  ∣ L D  ( h ) − L S  ( h ) ∣ 
To bound the left-hand side of the previous equation we first note that for every h ∈ H h \in \mathcal{H} h ∈ H L D ( h ) = E S ′ ∼ D m [ L S ′ ( h ) ] L_{\mathcal{D}}(h)=\mathbb{E}_{S^{\prime} \sim \mathcal{D}^m}\left[L_{S^{\prime}}(h)\right] L D  ( h ) = E S ′ ∼ D m  [ L S ′  ( h ) ] S ′ = z 1 ′ , … , z m ′ S^{\prime}=z_1^{\prime}, \ldots, z_m^{\prime} S ′ = z 1 ′  , … , z m ′  
E S ∼ D m [ sup  h ∈ H ∣ L D ( h ) − L S ( h ) ∣ ] = E S ∼ D m [ sup  h ∈ H ∣ E S ′ ∼ D m L S ′ ( h ) − L S ( h ) ∣ ] . \underset{S \sim \mathcal{D}^m}{\mathbb{E}}\left[\sup _{h \in \mathcal{H}}\left|L_{\mathcal{D}}(h)-L_S(h)\right|\right]=\underset{S \sim \mathcal{D}^m}{\mathbb{E}}\left[\sup _{h \in \mathcal{H}}\left|\underset{S^{\prime} \sim \mathcal{D}^m}{\mathbb{E}} L_{S^{\prime}}(h)-L_S(h)\right|\right] . S ∼ D m E  [ h ∈ H sup  ∣ L D  ( h ) − L S  ( h ) ∣ ] = S ∼ D m E  [ h ∈ H sup  ∣ ∣  S ′ ∼ D m E  L S ′  ( h ) − L S  ( h ) ∣ ∣  ] . A generalization of the triangle inequality yields
∣ E S ′ ∼ D m [ L S ′ ( h ) − L S ( h ) ] ∣ ≤ E S ′ ∼ D m ∣ L S ′ ( h ) − L S ( h ) ∣ , \left|\underset{S^{\prime} \sim \mathcal{D}^m}{\mathbb{E}}\left[L_{S^{\prime}}(h)-L_S(h)\right]\right| \leq \underset{S^{\prime} \sim \mathcal{D}^m}{\mathbb{E}}\left|L_{S^{\prime}}(h)-L_S(h)\right|, ∣ ∣  S ′ ∼ D m E  [ L S ′  ( h ) − L S  ( h ) ] ∣ ∣  ≤ S ′ ∼ D m E  ∣ L S ′  ( h ) − L S  ( h ) ∣ , and the fact that supermum of expectation is smaller than expectation of supremum yields
sup  h ∈ H E S ′ ∼ D m ∣ L S ′ ( h ) − L S ( h ) ∣ ≤ E S ′ ∼ D m sup  h ∈ H ∣ L S ′ ( h ) − L S ( h ) ∣ . \sup _{h \in \mathcal{H}} \underset{S^{\prime} \sim \mathcal{D}^m}{\mathbb{E}}\left|L_{S^{\prime}}(h)-L_S(h)\right| \leq \underset{S^{\prime} \sim \mathcal{D}^m}{\mathbb{E}} \sup _{h \in \mathcal{H}}\left|L_{S^{\prime}}(h)-L_S(h)\right| . h ∈ H sup  S ′ ∼ D m E  ∣ L S ′  ( h ) − L S  ( h ) ∣ ≤ S ′ ∼ D m E  h ∈ H sup  ∣ L S ′  ( h ) − L S  ( h ) ∣ . Formally, the previous two inequalities follow from Jensen's inequality. Combining all we obtain
E S ∼ D m [ sup  h ∈ H ∣ L D ( h ) − L S ( h ) ∣ ] ≤ E S , S ′ ∼ D m [ sup  h ∈ H ∣ L S ′ ( h ) − L S ( h ) ∣ ] = E S , S ′ ∼ D m [ sup  h ∈ H 1 m ∣ ∑ i = 1 m ( ℓ ( h , z i ′ ) − ℓ ( h , z i ) ) ∣ ] . (1) \begin{aligned} \underset{S \sim \mathcal{D}^m}{\mathbb{E}}\left[\sup _{h \in \mathcal{H}}\left|L_{\mathcal{D}}(h)-L_S(h)\right|\right] & \leq \underset{S, S^{\prime} \sim \mathcal{D}^m}{\mathbb{E}}\left[\sup _{h \in \mathcal{H}}\left|L_{S^{\prime}}(h)-L_S(h)\right|\right] \\ &=\underset{S, S^{\prime} \sim \mathcal{D}^m}{\mathbb{E}}\left[\sup _{h \in \mathcal{H}} \frac{1}{m}\left|\sum_{i=1}^m\left(\ell\left(h, z_i^{\prime}\right)-\ell\left(h, z_i\right)\right)\right|\right] . \tag{1} \end{aligned} S ∼ D m E  [ h ∈ H sup  ∣ L D  ( h ) − L S  ( h ) ∣ ]  ≤ S , S ′ ∼ D m E  [ h ∈ H sup  ∣ L S ′  ( h ) − L S  ( h ) ∣ ] = S , S ′ ∼ D m E  [ h ∈ H sup  m 1  ∣ ∣  i = 1 ∑ m  ( ℓ ( h , z i ′  ) − ℓ ( h , z i  ) ) ∣ ∣  ] .  ( 1 ) The expectation on the right-hand side is over a choice of two i.i.d. samples S = z 1 , … , z m S=z_1, \ldots, z_m S = z 1  , … , z m  S ′ = z 1 ′ , … , z m ′ S^{\prime}=z_1^{\prime}, \ldots, z_m^{\prime} S ′ = z 1 ′  , … , z m ′  2 m 2 m 2 m z i z_i z i  z i ′ z_i^{\prime} z i ′  ( ℓ ( h , z i ′ ) − ℓ ( h , z i ) ) \left(\ell\left(h, z_i^{\prime}\right)-\ell\left(h, z_i\right)\right) ( ℓ ( h , z i ′  ) − ℓ ( h , z i  ) ) − ( ℓ ( h , z i ′ ) − ℓ ( h , z i ) ) -\left(\ell\left(h, z_i^{\prime}\right)-\ell\left(h, z_i\right)\right) − ( ℓ ( h , z i ′  ) − ℓ ( h , z i  ) ) σ ∈ { ± 1 } m \boldsymbol{\sigma} \in\{\pm 1\}^m σ ∈ { ± 1 } m 
E S , S ′ ∼ D m [ sup  h ∈ H 1 m ∣ ∑ i = 1 m σ i ( ℓ ( h , z i ′ ) − ℓ ( h , z i ) ) ∣ ] \underset{S, S^{\prime} \sim \mathcal{D}^m}{\mathbb{E}}\left[\sup _{h \in \mathcal{H}} \frac{1}{m}\left|\sum_{i=1}^m \sigma_i\left(\ell\left(h, z_i^{\prime}\right)-\ell\left(h, z_i\right)\right)\right|\right] S , S ′ ∼ D m E  [ h ∈ H sup  m 1  ∣ ∣  i = 1 ∑ m  σ i  ( ℓ ( h , z i ′  ) − ℓ ( h , z i  ) ) ∣ ∣  ] Since this holds for every σ ∈ { ± 1 } m \boldsymbol{\sigma} \in\{\pm 1\}^m σ ∈ { ± 1 } m σ \boldsymbol{\sigma} σ { ± 1 } \{\pm 1\} { ± 1 } U ± U_{\pm} U ±  
E σ ∼ U ± m E S , S ′ ∼ D m [ sup  h ∈ H 1 m ∣ ∑ i = 1 m σ i ( ℓ ( h , z i ′ ) − ℓ ( h , z i ) ) ∣ ] , \underset{\sigma \sim U_{\pm}^m}{\mathbb{E}} \underset{S, S^{\prime} \sim \mathcal{D}^m}{\mathbb{E}}\left[\sup _{h \in \mathcal{H}} \frac{1}{m}\left|\sum_{i=1}^m \sigma_i\left(\ell\left(h, z_i^{\prime}\right)-\ell\left(h, z_i\right)\right)\right|\right], σ ∼ U ± m  E  S , S ′ ∼ D m E  [ h ∈ H sup  m 1  ∣ ∣  i = 1 ∑ m  σ i  ( ℓ ( h , z i ′  ) − ℓ ( h , z i  ) ) ∣ ∣  ] , and by the linearity of expectation it also equals
E S , S ′ ∼ D m E σ ∼ U ± m [ sup  h ∈ H 1 m ∣ ∑ i = 1 m σ i ( ℓ ( h , z i ′ ) − ℓ ( h , z i ) ) ∣ ] . \underset{S, S^{\prime} \sim \mathcal{D}^m}{\mathbb{E}} \underset{\boldsymbol{\sigma} \sim U_{\pm}^m}{\mathbb{E}}\left[\sup _{h \in \mathcal{H}} \frac{1}{m}\left|\sum_{i=1}^m \sigma_i\left(\ell\left(h, z_i^{\prime}\right)-\ell\left(h, z_i\right)\right)\right|\right] . S , S ′ ∼ D m E  σ ∼ U ± m  E  [ h ∈ H sup  m 1  ∣ ∣  i = 1 ∑ m  σ i  ( ℓ ( h , z i ′  ) − ℓ ( h , z i  ) ) ∣ ∣  ] . Next, fix S S S S ′ S^{\prime} S ′ C C C S S S S ′ S^{\prime} S ′ h ∈ H C h \in \mathcal{H}_C h ∈ H C  
E σ ∼ U ± m [ sup  h ∈ H 1 m ∣ ∑ i = 1 m σ i ( ℓ ( h , z i ′ ) − ℓ ( h , z i ) ) ∣ ] = E σ ∼ U ± m [ max  h ∈ H C 1 m ∣ ∑ i = 1 m σ i ( ℓ ( h , z i ′ ) − ℓ ( h , z i ) ) ∣ ] . \begin{aligned} &\underset{\boldsymbol{\sigma} \sim U_{\pm}^m}{\mathbb{E}}\left[\sup _{h \in \mathcal{H}} \frac{1}{m}\left|\sum_{i=1}^m \sigma_i\left(\ell\left(h, z_i^{\prime}\right)-\ell\left(h, z_i\right)\right)\right|\right] \\ &=\underset{\sigma \sim U_{\pm}^m}{\mathbb{E}}\left[\max _{h \in \mathcal{H}_C} \frac{1}{m}\left|\sum_{i=1}^m \sigma_i\left(\ell\left(h, z_i^{\prime}\right)-\ell\left(h, z_i\right)\right)\right|\right] . \end{aligned}  σ ∼ U ± m  E  [ h ∈ H sup  m 1  ∣ ∣  i = 1 ∑ m  σ i  ( ℓ ( h , z i ′  ) − ℓ ( h , z i  ) ) ∣ ∣  ] = σ ∼ U ± m  E  [ h ∈ H C  max  m 1  ∣ ∣  i = 1 ∑ m  σ i  ( ℓ ( h , z i ′  ) − ℓ ( h , z i  ) ) ∣ ∣  ] .  Fix some h ∈ H C h \in \mathcal{H}_C h ∈ H C  θ h = 1 m ∑ i = 1 m σ i ( ℓ ( h , z i ′ ) − ℓ ( h , z i ) ) \theta_h=\frac{1}{m} \sum_{i=1}^m \sigma_i\left(\ell\left(h, z_i^{\prime}\right)-\ell\left(h, z_i\right)\right) θ h  = m 1  ∑ i = 1 m  σ i  ( ℓ ( h , z i ′  ) − ℓ ( h , z i  ) ) E [ θ h ] = 0 \mathbb{E}\left[\theta_h\right]=0 E [ θ h  ] = 0 θ h \theta_h θ h  [ − 1 , 1 ] [-1,1] [ − 1 , 1 ] ρ > 0 \rho>0 ρ > 0 
P [ ∣ θ h ∣ > ρ ] ≤ 2 exp  ( − 2 m ρ 2 ) . \mathbb{P}\left[\left|\theta_h\right|>\rho\right] \leq 2 \exp \left(-2 m \rho^2\right) . P [ ∣ θ h  ∣ > ρ ] ≤ 2 exp ( − 2 m ρ 2 ) . Applying the union bound over h ∈ H C h \in \mathcal{H}_C h ∈ H C  ρ > 0 \rho>0 ρ > 0 
P [ max  h ∈ H C ∣ θ h ∣ > ρ ] ≤ 2 ∣ H C ∣ exp  ( − 2 m ρ 2 ) . \mathbb{P}\left[\max _{h \in \mathcal{H}_C}\left|\theta_h\right|>\rho\right] \leq 2\left|\mathcal{H}_C\right| \exp \left(-2 m \rho^2\right) . P [ h ∈ H C  max  ∣ θ h  ∣ > ρ ] ≤ 2 ∣ H C  ∣ exp ( − 2 m ρ 2 ) . Finally, the Lemma below tells us that the preceding implies
E [ max  h ∈ H C ∣ θ h ∣ ] ≤ 4 + log  ( ∣ H C ∣ ) 2 m . \mathbb{E}\left[\max _{h \in \mathcal{H}_C}\left|\theta_h\right|\right] \leq \frac{4+\sqrt{\log \left(\left|\mathcal{H}_C\right|\right)}}{\sqrt{2 m}} . E [ h ∈ H C  max  ∣ θ h  ∣ ] ≤ 2 m  4 + log  ( ∣ H C  ∣ )   . Combining all with the definition of τ H \tau_{\mathcal{H}} τ H  
E S ∼ D m [ sup  h ∈ H ∣ L D ( h ) − L S ( h ) ∣ ] ≤ 4 + log  ( τ H ( 2 m ) ) 2 m . \underset{S \sim \mathcal{D}^m}{\mathbb{E}}\left[\sup _{h \in \mathcal{H}}\left|L_{\mathcal{D}}(h)-L_S(h)\right|\right] \leq \frac{4+\sqrt{\log \left(\tau_{\mathcal{H}}(2 m)\right)}}{\sqrt{2 m}} . S ∼ D m E  [ h ∈ H sup  ∣ L D  ( h ) − L S  ( h ) ∣ ] ≤ 2 m  4 + log  ( τ H  ( 2 m ) )   . Lemma  Let X X X x ′ ∈ R x^{\prime} \in \mathbb{R} x ′ ∈ R a > 0 a>0 a > 0 b ≥ e b \geq e b ≥ e t ≥ 0 t \geq 0 t ≥ 0 P [ ∣ X − x ′ ∣ > \mathbb{P}\left[\left|X-x^{\prime}\right|>\right. P [ ∣ X − x ′ ∣ > t ] ≤ 2 b e − t 2 / a 2 t] \leq 2 b e^{-t^2 / a^2} t ] ≤ 2 b e − t 2 / a 2 E [ ∣ X − x ′ ∣ ] ≤ a ( 2 + log  ( b ) ) \mathbb{E}\left[\left|X-x^{\prime}\right|\right] \leq a(2+\sqrt{\log (b)}) E [ ∣ X − x ′ ∣ ] ≤ a ( 2 + log  ( b )  )  
Proof  For all i = 0 , 1 , 2 , … i=0,1,2, \ldots i = 0 , 1 , 2 , … t i = a ( i + log  ( b ) ) t_i=a(i+\sqrt{\log (b)}) t i  = a ( i + log  ( b )  ) t i t_i t i  
E [ ∣ X − x ′ ∣ ] ≤ a log  ( b ) + ∑ i = 1 ∞ t i P [ ∣ X − x ′ ∣ > t i − 1 ] \mathbb{E}\left[\left|X-x^{\prime}\right|\right] \leq a \sqrt{\log (b)}+\sum_{i=1}^{\infty} t_i \mathbb{P}\left[\left|X-x^{\prime}\right|>t_{i-1}\right] E [ ∣ X − x ′ ∣ ] ≤ a log  ( b )  + i = 1 ∑ ∞  t i  P [ ∣ X − x ′ ∣ > t i − 1  ] Using the assumption in the lemma we have
∑ i = 1 ∞ t i P [ ∣ X − x ′ ∣ > t i − 1 ] ≤ 2 a b ∑ i = 1 ∞ ( i + log  ( b ) ) e − ( i − 1 + log  ( b ) ) 2 ≤ 2 a b ∫ 1 + log  ( b ) ∞ x e − ( x − 1 ) 2 d x = 2 a b ∫ log  ( b ) ∞ ( y + 1 ) e − y 2 d y ≤ 4 a b ∫ log  ( b ) ∞ y e − y 2 d y = 2 a b [ − e − y 2 ] log  ( b ) ∞ = 2 a b / b = 2 a . \begin{array}{rl} \sum_{i=1}^{\infty} t_i & \mathbb{P}\left[\left|X-x^{\prime}\right|>t_{i-1}\right] \leq 2 a b \sum_{i=1}^{\infty}(i+\sqrt{\log (b)}) e^{-(i-1+\sqrt{\log (b)})^2} \\ & \leq 2 a b \int_{1+\sqrt{\log (b)}}^{\infty} x e^{-(x-1)^2} d x \\ & =2 a b \int_{\sqrt{\log (b)}}^{\infty}(y+1) e^{-y^2} d y \\ & \leq 4 a b \int_{\sqrt{\log (b)}}^{\infty} y e^{-y^2} d y \\ & =2 a b\left[-e^{-y^2}\right]_{\sqrt{\log (b)}}^{\infty} \\ & =2 a b / b=2 a . \end{array} ∑ i = 1 ∞  t i   P [ ∣ X − x ′ ∣ > t i − 1  ] ≤ 2 ab ∑ i = 1 ∞  ( i + log  ( b )  ) e − ( i − 1 + l o g ( b )  ) 2 ≤ 2 ab ∫ 1 + l o g ( b )  ∞  x e − ( x − 1 ) 2 d x = 2 ab ∫ l o g ( b )  ∞  ( y + 1 ) e − y 2 d y ≤ 4 ab ∫ l o g ( b )  ∞  y e − y 2 d y = 2 ab [ − e − y 2 ] l o g ( b )  ∞  = 2 ab / b = 2 a .  Combining the preceding inequalities we conclude our proof.